Quiz on Colossus

Test your understanding of concepts related to the design of Colossus via a quiz.

Question 3

Bigtable says that it uses Colossus to store its data persistently. On the other hand, Colossus uses Bigtable to achieve scalability. Is this codependence logical?


It seems so. The Colossus documentation doesn’t explain this clearly, but codependent systems are usually bootstrapped by special means. We have speculatively described one possible solution below, though we encourage you to think of alternative approaches.

Solution sketch: Bigtable uses a hierarchy to locate data shards, and the first level of that hierarchy is stored in another service, Chubby, to bootstrap the lookup process. Normally, whenever someone writes into Colossus, the associated metadata goes to a Bigtable instance. But when this specific Bigtable instance puts its own data into Colossus, it stores its location metadata using a lean hierarchy of shards (probably just one level). The main Colossus metadata can then use many such Bigtable instances to scale.
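The bootstrap chain above can be sketched as a short program. This is purely speculative: the names (`Chubby`, `RootTablet`, `SERVERS`, `bootstrap_lookup`) are illustrative stand-ins, not real Google APIs.

```python
# Speculative sketch of the bootstrap chain: Chubby -> lean one-level
# hierarchy -> tablet's serving manager. All names are hypothetical.

class Chubby:
    """Stands in for Chubby: a tiny, highly available store that holds
    only the address of the root metadata tablet."""
    def __init__(self, root_addr):
        self.root_addr = root_addr

class RootTablet:
    """A lean, single-level hierarchy: tablet name -> serving manager.
    Because it is only one level, locating it never recurses back into
    the main Colossus metadata."""
    def __init__(self, tablet_map):
        self.tablet_map = tablet_map

SERVERS = {}  # stands in for the network: address -> server object

def bootstrap_lookup(chubby, tablet_name):
    root = SERVERS[chubby.root_addr]      # 1. Chubby tells us where to start
    return root.tablet_map[tablet_name]   # 2. one hop to the tablet's manager

# Usage: resolve which manager serves a metadata tablet.
SERVERS["root-0"] = RootTablet({"tablet-2": "manager-1",
                                "tablet-24": "manager-2"})
owner = bootstrap_lookup(Chubby("root-0"), "tablet-24")  # -> "manager-2"
```

The key property is that the chain terminates: Chubby is small enough to need no sharding, and the root tablet is a single level, so no lookup ever depends on the very metadata it is trying to locate.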

Details:

Let’s re-examine the progression from GFS to Colossus.

GFS logically represents metadata and namespaces as a lookup table that maps full pathnames to metadata. As data volumes grew and small files proliferated, the metadata itself became huge.

  • A single manager cannot store such a large lookup table.
  • Searching such a large table takes time and adds to the latency.
  • A single manager cannot serve an increasing number of metadata requests simultaneously.

Colossus needed multiple managers instead of a single manager to manage and serve metadata with low latency and high throughput to many clients. Given multiple managers (metadata nodes), Colossus needs some partitioning logic to split the large metadata table across them. (They might use some variant of consistent hashing so that a client can locate which manager to contact.) Let’s call the group of managers (metadata nodes) a metadata cluster. As nodes/managers leave or join the cluster, we need to rebalance the load among the nodes in the cluster; rebalancing includes splitting or merging the metadata tables.
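To make the parenthetical concrete, here is a toy consistent-hash ring a client could use to pick the manager responsible for a metadata key. This is an assumption on our part, not a documented Colossus mechanism; the class and its methods are illustrative.

```python
import hashlib
from bisect import bisect_right

class ConsistentHashRing:
    """Toy consistent-hash ring mapping metadata keys to managers.
    Virtual nodes smooth the load; when a manager joins or leaves,
    only the neighbouring key ranges move, which limits rebalancing."""

    def __init__(self, managers, vnodes=64):
        self._ring = []  # sorted list of (hash, manager)
        for m in managers:
            self.add(m, vnodes)

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, manager, vnodes=64):
        for i in range(vnodes):
            self._ring.append((self._hash(f"{manager}#{i}"), manager))
        self._ring.sort()

    def remove(self, manager):
        self._ring = [(h, m) for h, m in self._ring if m != manager]

    def lookup(self, key):
        # First ring position clockwise from the key's hash owns the key.
        hashes = [h for h, _ in self._ring]
        idx = bisect_right(hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

A client holding the ring can route any pathname to a manager without a central coordinator, and a departing manager only hands its ranges to its ring neighbours.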

To get this partitioning logic for a large lookup table, along with the table splitting and merging needed to balance load among multiple managers, Colossus uses Bigtable, which already implements all of this. Maybe Colossus uses Bigtable as a black box, or maybe it uses a version specialized for Colossus, but the central idea is the same.

[Illustration: two managers in a Bigtable cluster, serving metadata tablets 2 and 24]
In the illustration above, we call the nodes in the Bigtable cluster managers. Each manager holds pointers to a set of metadata table partitions. For simplicity, we show only two partitions, called tablet 2 and tablet 24. Manager 1 holds pointer 2, meaning it serves requests for tablet 2, and manager 2 serves requests for tablet 24. Each manager searches only a small table, so lookups are fast and metadata operations complete quickly.

The managers in the metadata cluster (the nodes in the Bigtable cluster) store tablets on the Colossus file system: the actual tablet metadata lives in the same storage pool where Colossus stores data. The locations of the tablets can again be stored in the metadata cluster (the Bigtable). Alternatively, each manager can store a tablet’s location alongside its pointer to that tablet.
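The second alternative can be sketched in a few lines: each manager keeps, next to its tablet pointer, the path in the shared storage pool where that tablet’s data lives. All names here (`Manager`, `ColossusPool`, the paths) are hypothetical.

```python
# Sketch of a manager resolving a tablet to its location in the shared
# storage pool. Hypothetical names; not a real Colossus interface.

class ColossusPool:
    """Stands in for the shared storage pool that holds both file data
    and the metadata tablets themselves."""
    def __init__(self):
        self.files = {}

    def write(self, path, data):
        self.files[path] = data

    def read(self, path):
        return self.files[path]

class Manager:
    """A metadata manager: maps each tablet it serves to the pool path
    where that tablet's data is stored."""
    def __init__(self, tablets):
        self.tablets = tablets  # tablet id -> pool path

    def tablet_path(self, tablet_id):
        return self.tablets[tablet_id]

# A metadata read: the manager maps tablet -> path, the pool serves it.
pool = ColossusPool()
pool.write("/colossus/meta/tablet-24", {"/user/file1": "chunk-locations"})
m2 = Manager({"tablet-24": "/colossus/meta/tablet-24"})
meta = pool.read(m2.tablet_path("tablet-24"))
```

Storing the location next to the pointer saves one extra lookup in the metadata cluster, at the cost of each manager tracking where its own tablets live.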
